14 research outputs found

    TranSTYLer: Multimodal Behavioral Style Transfer for Facial and Body Gestures Generation

    This paper addresses the challenge of transferring the behavior expressivity style of one virtual agent to another while preserving the shape of behaviors, as they carry communicative meaning. Behavior expressivity style is viewed here as the qualitative properties of behaviors. We propose TranSTYLer, a multimodal transformer-based model that synthesizes the multimodal behaviors of a source speaker with the style of a target speaker. We assume that behavior expressivity style is encoded across various modalities of communication, including text, speech, body gestures, and facial expressions. The model employs a style and content disentanglement schema to ensure that the transferred style does not interfere with the meaning conveyed by the source behaviors. Our approach eliminates the need for style labels and allows generalization to styles that have not been seen during the training phase. We train our model on the PATS corpus, which we extended to include dialog acts and 2D facial landmarks. Objective and subjective evaluations show that our model outperforms state-of-the-art models in style transfer for both seen and unseen styles during training. To tackle the issues of style and content leakage that may arise, we propose a methodology to assess the degree to which behaviors and gestures associated with the target style are successfully transferred, while ensuring the preservation of those related to the source content.
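
    The disentanglement setup described above can be pictured with a minimal sketch: one encoder for the source speaker's content, one for the target speaker's style, and a decoder that generates gestures from the content conditioned on the style embedding. The sketch below assumes PyTorch; the module names, layer sizes, and the additive conditioning scheme are illustrative assumptions, not TranSTYLer's actual architecture.

```python
# Minimal sketch of transformer-based style/content disentanglement for gesture
# generation. All names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

D_MODEL = 256  # shared hidden size (assumption)

class StyleEncoder(nn.Module):
    """Pools a target speaker's multimodal sequence into one style vector."""
    def __init__(self, d_model=D_MODEL):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, target_feats):            # (B, T, D_MODEL)
        h = self.encoder(target_feats)
        return h.mean(dim=1)                    # (B, D_MODEL) style embedding

class ContentEncoder(nn.Module):
    """Encodes the source speaker's text/speech content sequence."""
    def __init__(self, d_model=D_MODEL):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)

    def forward(self, source_feats):            # (B, T, D_MODEL)
        return self.encoder(source_feats)       # (B, T, D_MODEL) content codes

class GestureDecoder(nn.Module):
    """Generates gesture frames from content codes conditioned on a style vector."""
    def __init__(self, d_model=D_MODEL, pose_dim=64):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.decoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, pose_dim)

    def forward(self, content, style):
        # Broadcast the style embedding over time and add it to the content codes.
        cond = content + style.unsqueeze(1)
        return self.head(self.decoder(cond))    # (B, T, pose_dim)
```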

    Zero-Shot Style Transfer for Gesture Animation driven by Text and Speech using Adversarial Disentanglement of Multimodal Style Encoding

    Modeling virtual agents with behavior style is one factor in personalizing human-agent interaction. We propose an efficient yet effective machine learning approach to synthesize gestures driven by prosodic features and text in the style of different speakers, including those unseen during training. Our model performs zero-shot multimodal style transfer driven by multimodal data from the PATS database, which contains videos of various speakers. We view style as pervasive while speaking: it colors the expressivity of communicative behaviors, while speech content is carried by multimodal signals and text. This disentanglement scheme of content and style allows us to directly infer the style embedding even of a speaker whose data are not part of the training phase, without requiring any further training or fine-tuning. The first goal of our model is to generate the gestures of a source speaker based on the content of the two modalities, audio and text. The second goal is to condition the source speaker's predicted gestures on the multimodal behavior style embedding of a target speaker. The third goal is to allow zero-shot style transfer to speakers unseen during training, without retraining the model. Our system consists of: (1) a speaker style encoder network that learns to generate a fixed-dimensional speaker style embedding from a target speaker's multimodal data, and (2) a sequence-to-sequence synthesis network that synthesizes gestures based on the content of the input modalities of a source speaker, conditioned on the speaker style embedding. We show that our model can synthesize gestures of a source speaker and transfer the knowledge of target speaker style variability to the gesture generation task in a zero-shot setup. We convert the 2D gestures to 3D poses and produce 3D animations. We conduct objective and subjective evaluations to validate our approach and compare it with a baseline.
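
    The zero-shot property hinges on the style encoder producing a fixed-dimensional embedding from whatever multimodal data a new speaker provides, so conditioning on an unseen speaker is a single forward pass with the trained weights frozen. A minimal sketch of that inference step follows, again in PyTorch; the GRU-based encoder, the feature dimensions, and the generate_gestures stub are assumptions for illustration, not the paper's implementation.

```python
# Sketch of zero-shot style inference: a trained style encoder maps an unseen
# speaker's multimodal sequence to a fixed-size embedding without fine-tuning.
# Module names and dimensions are illustrative assumptions.
import torch
import torch.nn as nn

class SpeakerStyleEncoder(nn.Module):
    def __init__(self, feat_dim=128, style_dim=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, style_dim, batch_first=True)

    def forward(self, multimodal_seq):           # (B, T, feat_dim)
        _, last_hidden = self.rnn(multimodal_seq)
        return last_hidden[-1]                   # (B, style_dim) fixed-size embedding

def generate_gestures(content_seq, style_embedding):
    """Stand-in for the sequence-to-sequence synthesis network (assumed)."""
    B, T, _ = content_seq.shape
    return torch.zeros(B, T, 2 * 10)             # e.g. 10 upper-body keypoints in 2D

style_encoder = SpeakerStyleEncoder()             # would be loaded from a checkpoint
style_encoder.eval()

unseen_speaker_data = torch.randn(1, 300, 128)    # text/speech/gesture features (assumed)
source_content = torch.randn(1, 300, 128)

with torch.no_grad():                             # no retraining or fine-tuning needed
    style = style_encoder(unseen_speaker_data)
    gestures = generate_gestures(source_content, style)
```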

    Guidelines for the use and interpretation of assays for monitoring autophagy (3rd edition)

    In 2008 we published the first set of guidelines for standardizing research in autophagy. Since then, research on this topic has continued to accelerate, and many new scientists have entered the field. Our knowledge base and relevant new technologies have also been expanding. Accordingly, it is important to update these guidelines for monitoring autophagy in different organisms. Various reviews have described the range of assays that have been used for this purpose. Nevertheless, there continues to be confusion regarding acceptable methods to measure autophagy, especially in multicellular eukaryotes. For example, a key point that needs to be emphasized is that there is a difference between measurements that monitor the numbers or volume of autophagic elements (e.g., autophagosomes or autolysosomes) at any stage of the autophagic process versus those that measure flux through the autophagy pathway (i.e., the complete process including the amount and rate of cargo sequestered and degraded). In particular, a block in macroautophagy that results in autophagosome accumulation must be differentiated from stimuli that increase autophagic activity, defined as increased autophagy induction coupled with increased delivery to, and degradation within, lysosomes (in most higher eukaryotes and some protists such as Dictyostelium) or the vacuole (in plants and fungi). In other words, it is especially important that investigators new to the field understand that the appearance of more autophagosomes does not necessarily equate with more autophagy. In fact, in many cases, autophagosomes accumulate because of a block in trafficking to lysosomes without a concomitant change in autophagosome biogenesis, whereas an increase in autolysosomes may reflect a reduction in degradative activity. It is worth emphasizing here that lysosomal digestion is a stage of autophagy and evaluating its competence is a crucial part of the evaluation of autophagic flux, or complete autophagy. Here, we present a set of guidelines for the selection and interpretation of methods for use by investigators who aim to examine macroautophagy and related processes, as well as for reviewers who need to provide realistic and reasonable critiques of papers that are focused on these processes. These guidelines are not meant to be a formulaic set of rules, because the appropriate assays depend in part on the question being asked and the system being used. In addition, we emphasize that no individual assay is guaranteed to be the most appropriate one in every situation, and we strongly recommend the use of multiple assays to monitor autophagy. Along these lines, because of the potential for pleiotropic effects due to blocking autophagy through genetic manipulation, it is imperative to delete or knock down more than one autophagy-related gene. In addition, some individual Atg proteins, or groups of proteins, are involved in other cellular pathways, so not all Atg proteins can be used as a specific marker for an autophagic process. In these guidelines, we consider these various methods of assessing autophagy and what information can, or cannot, be obtained from them. Finally, by discussing the merits and limits of particular autophagy assays, we hope to encourage technical innovation in the field.

    The evolving SARS-CoV-2 epidemic in Africa: Insights from rapidly expanding genomic surveillance

    INTRODUCTION
    Investment in Africa over the past year with regard to severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) sequencing has led to a massive increase in the number of sequences, which, to date, exceeds 100,000 sequences generated to track the pandemic on the continent. These sequences have profoundly affected how public health officials in Africa have navigated the COVID-19 pandemic.

    RATIONALE
    We demonstrate how the first 100,000 SARS-CoV-2 sequences from Africa have helped monitor the epidemic on the continent, how genomic surveillance expanded over the course of the pandemic, and how we adapted our sequencing methods to deal with an evolving virus. Finally, we also examine how viral lineages have spread across the continent in a phylogeographic framework to gain insights into the underlying temporal and spatial transmission dynamics for several variants of concern (VOCs).

    RESULTS
    Our results indicate that the number of countries in Africa that can sequence the virus within their own borders is growing and that this is coupled with a shorter turnaround time from the time of sampling to sequence submission. Ongoing evolution necessitated the continual updating of primer sets, and, as a result, eight primer sets were designed in tandem with viral evolution and used to ensure effective sequencing of the virus. The pandemic unfolded through multiple waves of infection that were each driven by distinct genetic lineages, with B.1-like ancestral strains associated with the first pandemic wave of infections in 2020. Successive waves on the continent were fueled by different VOCs, with Alpha and Beta cocirculating in distinct spatial patterns during the second wave and Delta and Omicron affecting the whole continent during the third and fourth waves, respectively. Phylogeographic reconstruction points toward distinct differences in viral importation and exportation patterns associated with the Alpha, Beta, Delta, and Omicron variants and subvariants, when considering both Africa versus the rest of the world and viral dissemination within the continent. Our epidemiological and phylogenetic inferences therefore underscore the heterogeneous nature of the pandemic on the continent and highlight key insights and challenges, for instance, recognizing the limitations of low testing proportions. We also highlight the early warning capacity that genomic surveillance in Africa has had for the rest of the world with the detection of new lineages and variants, the most recent being the characterization of various Omicron subvariants.

    CONCLUSION
    Sustained investment for diagnostics and genomic surveillance in Africa is needed as the virus continues to evolve. This is important not only to help combat SARS-CoV-2 on the continent but also because it can be used as a platform to help address the many emerging and reemerging infectious disease threats in Africa. In particular, capacity building for local sequencing within countries or within the continent should be prioritized because this is generally associated with shorter turnaround times, providing the most benefit to local public health authorities tasked with pandemic response and mitigation and allowing for the fastest reaction to localized outbreaks. These investments are crucial for pandemic preparedness and response and will serve the health of the continent well into the 21st century.

    Gestes expressifs multimodaux avec style

    La gĂ©nĂ©ration de gestes expressifs permet aux agents conversationnels animĂ©s (ACA) d'articuler un discours d'une maniĂšre semblable Ă  celle des humains. Le thĂšme central du manuscrit est d'exploiter et contrĂŽler l'expressivitĂ© comportementale des ACA en modĂ©lisant le comportement multimodal que les humains utilisent pendant la communication. Le but est (1) d’exploiter la prosodie de la parole, la prosodie visuelle et le langage dans le but de synthĂ©tiser des comportements expressifs pour les ACA; (2) de contrĂŽler le style des gestes synthĂ©tisĂ©s de maniĂšre Ă  pouvoir les gĂ©nĂ©rer avec le style de n'importe quel locuteur. Nous proposons un modĂšle de synthĂšse de gestes faciaux Ă  partir du texte et la parole; et entraĂźnĂ© sur le corpus TEDx que nous avons collectĂ©. Nous proposons ZS-MSTM 1.0, une approche permettant de synthĂ©tiser des gestes stylisĂ©s du haut du corps Ă  partir du contenu du discours d'un locuteur source et correspondant au style de tout locuteur cible. Il est entraĂźnĂ© sur le corpus PATS qui inclut des donnĂ©es multimodales de locuteurs ayant des styles de comportement diffĂ©rents. Il n'est pas limitĂ© aux locuteurs de PATS, et gĂ©nĂšre des gestes dans le style de n'importe quel nouveau locuteur vu ou non par notre modĂšle, sans entraĂźnement supplĂ©mentaire, ce qui rend notre approche «zero-shot». Le style comportemental est modĂ©lisĂ© sur les donnĂ©es multimodales des locuteurs - langage, gestes et parole - et indĂ©pendamment de l'identitĂ© du locuteur. Nous proposons ZS-MSTM 2.0 pour gĂ©nĂ©rer des gestes faciaux stylisĂ©s en plus des gestes du haut du corps. Ce dernier est entraĂźnĂ© sur une extension de PATS, qui inclut des actes de dialogue et des repĂšres faciaux en 2D.The generation of expressive gestures allows Embodied Conversational Agents (ECA) to articulate the speech intent and content in a human-like fashion. The central theme of the manuscript is to leverage and control the ECAs’ behavioral expressivity by modelling the complex multimodal behavior that humans employ during communication. The driving forces of the Thesis are twofold: (1) to exploit speech prosody, visual prosody and language with the aim of synthesizing expressive and human-like behaviors for ECAs; (2) to control the style of the synthesized gestures such that we can generate them with the style of any speaker. With these motivations in mind, we first propose a semantically aware and speech-driven facial and head gesture synthesis model trained on the TEDx Corpus which we collected. Then we propose ZS-MSTM 1.0, an approach to synthesize stylized upper-body gestures, driven by the content of a source speaker’s speech and corresponding to the style of any target speakers, seen or unseen by our model. It is trained on PATS Corpus which includes multimodal data of speakers having different behavioral style. ZS-MSTM 1.0 is not limited to PATS speakers, and can generate gestures in the style of any newly coming speaker without further training or fine-tuning, rendering our approach zero-shot. Behavioral style is modelled based on multimodal speakers’ data - language, body gestures, and speech - and independent from the speaker’s identity ("ID"). We additionally propose ZS-MSTM 2.0 to generate stylized facial gestures in addition to the upper-body gestures. We train ZS-MSTM 2.0 on PATS Corpus, which we extended to include dialog acts and 2D facial landmarks

    Transformer Network for Semantically-Aware and Speech-Driven Upper-Face Generation

    We propose a semantically-aware, speech-driven model to generate expressive and natural upper-facial and head motion for Embodied Conversational Agents (ECAs). In this work, we aim to produce natural and continuous head motion and upper-facial gestures synchronized with speech. We propose a model that generates these gestures from multimodal input features: the first modality is text, and the second is speech prosody. Our model uses Transformers and convolutions to map the multimodal features corresponding to an utterance to continuous eyebrow and head gestures. We conduct subjective and objective evaluations to validate our approach and compare it with the state of the art.
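
    As a rough picture of the text-plus-prosody mapping described above: each modality can be embedded, concatenated frame-wise, passed through a convolutional front-end and a Transformer encoder, and projected to eyebrow and head-motion trajectories. The PyTorch sketch below is a hedged illustration under those assumptions; the layer choices, feature sizes, and output parameterization (e.g., three head-rotation angles) are not taken from the paper.

```python
# Illustrative sketch: convolutional front-end + Transformer encoder mapping
# aligned text and prosody features to eyebrow and head-motion trajectories.
# All names and dimensions are assumptions, not the paper's configuration.
import torch
import torch.nn as nn

class UpperFaceGenerator(nn.Module):
    def __init__(self, text_dim=300, prosody_dim=4, d_model=128,
                 n_eyebrow=10, n_head_rot=3):
        super().__init__()
        self.n_eyebrow = n_eyebrow
        self.proj = nn.Linear(text_dim + prosody_dim, d_model)
        # Local temporal smoothing with a 1D convolution before global attention.
        self.conv = nn.Conv1d(d_model, d_model, kernel_size=5, padding=2)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.out = nn.Linear(d_model, n_eyebrow + n_head_rot)

    def forward(self, text_feats, prosody_feats):   # (B, T, text_dim), (B, T, prosody_dim)
        x = self.proj(torch.cat([text_feats, prosody_feats], dim=-1))
        x = self.conv(x.transpose(1, 2)).transpose(1, 2)   # Conv1d expects (B, C, T)
        h = self.encoder(x)
        y = self.out(h)                                     # (B, T, n_eyebrow + n_head_rot)
        return y[..., :self.n_eyebrow], y[..., self.n_eyebrow:]

model = UpperFaceGenerator()
eyebrows, head = model(torch.randn(1, 200, 300), torch.randn(1, 200, 4))
```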

    Recessive Mutations in RTN4IP1 Cause Isolated and Syndromic Optic Neuropathies

    Autosomal-recessive optic neuropathies are rare blinding conditions related to retinal ganglion cell (RGC) and optic-nerve degeneration, for which only mutations in TMEM126A and ACO2 are known. In four families with early-onset recessive optic neuropathy, we identified mutations in RTN4IP1, which encodes a mitochondrial ubiquinol oxidoreductase. RTN4IP1 is a partner of RTN4 (also known as NOGO), and its ortholog Rad8 in C. elegans is involved in the UV light response. Analysis of fibroblasts from affected individuals with an RTN4IP1 mutation showed loss of the altered protein, a deficit in mitochondrial respiratory complex I and IV activities, and increased susceptibility to UV light. Silencing of RTN4IP1 altered the number and morphogenesis of mouse RGC dendrites in vitro, and eye size, neuro-retinal development, and swimming behavior in zebrafish in vivo. Altogether, these data point to a pathophysiological mechanism responsible for early RGC degeneration and optic neuropathy, linking RTN4IP1 functions to mitochondrial physiology, the response to UV light, and dendrite growth during eye maturation.